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Abstract 

When implementing regular enough functions (e.g., elementary or spe- 
cial functions) on a computing system, we frequently use polynomial ap- 
proximations. In most cases, the polynomial that best approximates (for a 
given distance and in a given interval) a function has coefficients that are 
not exactly representable with a finite number of bits. And yet, the poly- 
nomial approximations that are actually implemented do have coefficients 
that are represented with a finite - and sometimes small - number of bits: 
this is due to the finiteness of the floating-point representations (for soft- 
ware implementations), and to the need to have small, hence fast and/or 
inexpensive, multipliers (for hardware implementations). We then have 
to consider polynomial approximations for which the degree-i coefficient 
has at most m, fractional bits (in other words, it is a rational number with 
denominator 2 mi ). We provide a general method for finding the best poly- 
nomial approximation under this constraint. Then, we suggest refinements 
than can be used to accelerate our method. 

Introduction 

All the functions considered in this article are real valued functions of the real 
variable and all the polynomials have real coefficients. After an initial range 
reduction step 1 9 8 3 [, the problem of evaluating a function ip in a large domain 
on a computer system is reduced to the problem of evaluating a possibly dif- 
ferent function / in a small domain, that is generally of the form [0, a], where 
a is a small nonnegative real. Polynomial approximations are among the most 
frequently chosen ways of performing this last approximation. 

Two kinds of polynomial approximations are used: the approximations 
that minimize the "average error," called least squares approximations, and the 
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approximations that minimize the worst-case error, called least maximum ap- 
proximations, or minimax approximations. In both cases, we want to minimize a 
distance | \p — f\\, where p is a polynomial of a given degree. For least squares 
approximations, that distance is: 

/ ra \ 1/2 

||p-/||2,[o,o] = ( J w{x){f{x) -p{x)f dx\ 

where w is a continuous weight function, that can be used to select parts of 
[0, a] where we want the approximation to be more accurate. For minimax 
approximations, the distance is: 

Hp- /IU,[o,o] = \p( x ) ~f( x )\- 

0<x<a 

The least squares approximations are computed by a projection method us- 
ing orthogonal polynomials. Minimax approximations are computed using an 
algorithm due to Remez J^J|5|. See 0EI for recent presentations of elementary 
function algorithms. 

In this paper, we are concerned with minimax approximations. Our ap- 
proximations will be used in finite-precision arithmetic. Hence, the computed 
polynomial coefficients are usually rounded: the coefficient pi of the minimax 
approximation 

p(x) = po+ pix H + p„x n 

is rounded to, say, the nearest multiple of 2~" H . By doing that, we obtain a 
slightly different polynomial approximation p. But we have no guarantee that p is 
the best minimax approximation to f among the polynomials whose degree i coefficient 
is a multiple of2~ mi . The aim of this paper is to give a way of finding this "best 
truncated approximation". We have two goals in mind: 

• rather low precision (say, around 15 bits), hardware-oriented, for specific- 
purpose implementations. In such cases, to minimize multiplier sizes 
(which increases speed and saves silicon area), the values of rm, for i > 1, 
should be very small. The degrees of the polynomial approximations are 
low. Typical recent examples are given in 1191 1101 . Roughly speaking, 
what matters here is to reduce the cost (in terms of delay and area) with- 
out making the accuracy unacceptable; 

• single-precision or double-precision, software-oriented, general-purpose 
implementations for implementation on current microprocessors. Using 
Table-driven methods, such as the ones suggested by Tang H5lll6llT7lll8l . 
the degree of the polynomial approximations can be made rather low. 
Roughly speaking, what matters in that case is to get very high accuracy, 
without making the cost (in terms of delay and memory) unacceptable. 

The outline of the paper is the following. We give an account of Chebyshev 
polynomials and some of their properties in Section ^ Then, in Section |21 we 
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provide a general method that finds the "best truncated approximation" of a 
function / over a compact interval [0, a]. We finish with two examples. 

Our method is implemented in Maple programs that can be downloaded 
from http : / / www . ens-lyon . f r/ ~nbriseba/ trunc . html We plan to 
prepare a C version of these programs which should be much faster. 



1 Some reminders on Chebyshev polynomials 

Definition 1 (Chebyshev polynomials) The Chebyshev polynomials can be defined 
either by the recurrence relation 

T (x) = 1 

Ti(x) = x (1) 
T n (x) = 2xT n ^t(x) - T„_ 2 (a;); 



or by 



) cos (n cos 1 x) (|x| < 1) 
n{ j_1 cosh (n cosh- 1 ar) (x > 1). 1 ' 



The first Chebyshev polynomials are listed below. 

T (x) = 1, 

Ti(x) = x, 

T 2 (x) = 2x 2 - 1, 

T 3 (a;) = 4x 3 -3x, 

T 4 (x) = 8x 4 -8x 2 + l, 

T 5 (a:) = 16a; 5 - 20x 3 + 5x. 

An example of Chebyshev polynomial (T7) is plotted in Fig.^ 
These polynomials play a central role in approximation theory. Among 
their many properties, the following ones will be useful in the sequel of this 
paper. A presentation of the Chebyshev polynomials can be found in [1 1 and 
especially in 1141 . 

Property 1 For n > 0, we have 

k=0 y ' 

Hence, T n has degree n and its leading coefficient is 2 n ~ 1 . It has n real roots, all 
strictly between —1 and 1. 

Property 2 There are exactly n + 1 values xq, X\, x 2 , ■ ■ • , x n such that 
— 1 = xq<xi<x 2 <--- < x n = 1, 
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Figure 1: Graph of the polynomial TV (a:). 



which satisfy 

T n ( Xi ) = (-l) n_i max \T n {x)\ Vi, i = 0, . . . , n. 
xe[— 1,1] 

TTzaf j's, f/ze maximum absolute value ofT n is attained at the x/s, and the sign ofT n 
alternates at these points. 

We recall that a monic polynomial is a polynomial whose leading coefficient 
isl. 

Property 3 (Monic polynomials of smallest norm) Let a, b e R, a < b. The 

monic degree-n polynomial having the smallest 1 1- 1 |oo,[«.,6] norm in [a,b] is 

(b-aT ( 2x-b-a \ 
2 2 ™-i n \ b-a J' 

The central result in polynomial approximation theory is the following the- 
orem, due to Chebyshev. 

Theorem 1 (Chebyshev) Let a, b e M, a < b. The polynomial p is the minimax 
approximation of degree < nto a continuous function f on [a, b] if and only if there 
exist at least n + 2 values 

a < x < x\ < x 2 < ■ ■ ■ < x n+ i < b 
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such that: 

p(Xi) - f(Xi) = {-I) 1 \p(x ) - f(x )] = ±||/ -p||oo,[o,6]- 

Throughout the paper, we will make frequent use of the polynomials 

T*{x)=T n {2x-l). 

The first polynomials T* are given below We have (see |4, Chap. 3] for ex- 
ample) T*(x) = T 2n (a; 1 / 2 ), hence all the coefficients of T* are nonzero integers. 

Tq{x) = 1, 

rf(ar) = 2a; -1, 

T|(a;) = 8x 2 -8x+l, 

T 3 *(a;) = 32a; 3 - 48a; 2 + 18a; - 1, 

T* (x) = 128a; 4 - 256a; 3 + 160a; 2 - 32a; + 1, 

T 5 *(a;) = 512a; 5 - 1280a; 4 + 1120a; 3 - 400a; 2 + 50a; - 1 . 

Theorem 2 (Polynomial of smallest norm with degree-Zc coefficient equal to 1.) 

Let a £ (0, +oo), de/i'ne 

/3 + fax + fox 2 + ■■■ + /3 n x n = T* Q . 
Let k be an integer, < k <n,the polynomial 

±T* ( ^ . 

/las f/ze smallest |.| |oo.[o.a] norm in [0, a] among the polynomials of degree at most n 
with a degree-k coefficient equal to 1. That norm is |l//3fc|. 

Moreover, when k = n — or 1 < k < n, this polynomial is the only one having 
this property. 

Proving this theorem first requires the following results. 
Proposition 1 Let (<5j)i=o,...,n be an increasing sequence of nonnegative integers and 

P{x) = a x So H h a n x s " e R[x], 

then either P = or P has at most n zeros in (0, +oo). 

Proof. By induction on n. For n = 0, it is straightforward. Now we assume 
that the property is true until the rank n. Let P(x) — aox s ° + • • • + a n x Sn + 
a n +ix Sn+1 6 R[x] with < So < ■ ■ ■ < 5 n+ i and a^cii . . . a n +i ^ 0. Assume that 
P has at least n + 2 zeros in (0, +oo). Then Pi — P/x s ° has at least n + 2 zeros 
in (0, +oo). 

Thus, the nonzero polynomial P{(x) = (Si — Sa)aiX Sl ~ s ° + • • • + (<5„+i — 
<5o)an+ia; <5 " +1_50 has, from Rolle's Theorem, at least n + 1 zeros in (0, +oo), 
which contradicts the induction hypothesis. □ 
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Corollary 1 Let k be an integer, 1 < k < n, and 

71 

P(x) = ^Teja? S R[x\. 

3 = 

If P has at least n zeros in [0, +00) and at most a simple zero in 0, then P = 0. 

Proof. If -P(O) 7^ 0, then P has at least n zeros in (0, +oo), hence P = from 
Proposition Suppose now that P(0) = 0. We can rewrite P as P(x) — 

n 

ejX J . As P has at least n — 1 zeros in (0, +00), it must yet vanish identi- 
cally from Proposition^] □ 

Proof of Theorem|2| We give the proof in the case a = 1 (the general case is a 
straightforward generalization). 

The case k = n = is straightforward. 

n 

Denote P^(a;) = afcx*. From PropertyEI there exist = 770 < ?/i < ■ • ■ < 
r\ n = 1 such that 

•rtfo) = H ll^lloo.lo.i] = % 1 (- 1 ) n " i - 

Now, we assume 1 < k < n. This part of the proof follows step by step 

n 

the proof of Theorem 2.1 in 1141 . Let q(x) — CjX^ € M[x] satisfy ||x fe — 

3=0, 

9( a; )l|oo,[o,i] — l a fe 1 |- We suppose that x k — q =/= T*. Then the polynomial 

n 

P(x) = a^ 1 T*(x) — (.x fe —q(x)) has the form d^a?' and is not identically zero. 

j=o, 

Hence there exist i and j,0<i<j<n, such that P(?7o) = ■ • • = P(?/;_i) = 
0, P(r?i) 7^ and P(^) 7^ 0, P(?7 J+ i) = ■ • • = P(rj n ) = (otherwise, the nonzero 
polynomial P would have at least n distinct roots in [0, 1] which would contra- 
dict CorollaryG). Let I such that P(r)i) ^ then sgn P(r?{) = sgn a^ x T*{rn) = 
(-1)"-' sgn a" 1 . Let m such that P(t ?/ ) ^ 0, P(t?/+i) = • • • = P(r? i+m -i) = 0, 
P(r)i+ m ) 7^ : P has at least m — 1 zeros in [77/, ?7/+ m ]. We distinguish two cases: 

• If to is even, we have sgn P(r/;) = sgn P(r)i +m ) and thus, P must have an 
even number of zeros (counted with multiplicity) in [rji, T)i+ m ]. 

• If m is odd, we have sgn P(rji) = — sgn P(rji +m ) and thus, P must have 
an odd number of zeros (counted with multiplicity) in [77/, ?7/+m]- 

In both cases, we conclude that P has at least to zeros in [rji, ?7/+ m ]. 
Then P has at least j — i zeros in [rji, rjj]. Finally, P has not less than i + ( j — 
i) + n — j = n zeros in [0, 1] (P has at least i zeros in [770, rji) and at least n — j 
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zeros in (rjj ,rj n ]). Note that we also obtained that P has no less than n — 1 zeros 
in (0, 1]. As P is nonzero, this contradicts Corollary^ 

n 

To end, we assume k = and n > 1. Let = ^^Cja;-' € satisfy 

3=1 

|| 1 — <z(a;)||oo,[o,i] < l a o 1 |- Then the polynomial P(x) — cLq T*(x) — (1 — q(x)) 

n 

has the form elja^ and is not identically zero. This polynomial changes sign 

3=1 

between any two consecutive extrema of T*, hence it has at least n zeros in 
(0, 1). As it cancels also at 0, we deduce that P vanishes identically, which is 
the contradiction desired. □ 

Remark 1 When k — and n > 1, it is not possible to prove unicity: for example, 
let a = 1, k = 0, n = 1, the polynomials 1 - Ax with A e [0, 2] have all a \.\ |oo,[o,i] 
norm equal to 1, 



2 Getting the "truncated" polynomial that is closest 
to a function in [0, a] 

Let a G (0, +00), let / be a function defined on [0, a] and mo, mi, . . ., m„ be 
n + 1 integers. Define •pj™°> mi >— > m "l as me se f- G f me polynomials of degree 
less than or equal to n whose degree-i coefficient is a multiple of 2~" li for all i 
between and n (we will call these polynomials "truncated polynomials"). 

Let p be the minimax approximation to / on [0, a]. Define p as the polyno- 
mial whose degree-i coefficient is obtained by rounding the degree-i coefficient 
of p to the nearest multiple of 2" mi (with an arbitrary choice in case of a tie) for 
i = 0, . . . , n: p is an element of p^ * 1711 '— > m ™\ 

Also define e and e as 

e = 11/ -pIU,[o,q] and e = ||/ -p|U,[ ,a]- 
We assume that e / 0. Let A £ [4, l], we are looking for a truncated poly- 
nomial p* g pj^."*."..™-] S uch that' 

11/ -P*||oo,[o,a] = min , ||/-3||oo,[0, a ] 

gg7 ,J.m ,m 1 ,...,m„] 

and 



||/-P*||oo,[0,o] < M\f -Pl|oo,[0,a]- (3) 

When A = 1, this problem has a solution since p satisfies J3l- It should be 
noticed that, in that case, p* is not necessarily equal to p. 

In the following, we compute bounds on the coefficients of a polynomial 

q g p^,m u ...,m n ] suchthaHf 

q is not within these bounds, then 

||p-?Hoo,[o,a] > e + Ae. 
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Knowing these bounds will allow an exhaustive searching of p* . To do that, 
consider a polynomial q whose degree-i coefficient is pi + <5,;, with Si ^ 0. Let 
us see how close can q be to p. We have 

(q - p){x) = 6. t x l + ^2 (ij ~ Pj) xJ ■ 

0<j<n, 

Hence, | \q — p\ |oo,[o,a] is minimum implies that 

l^ + T" (ij -Pj)x J \\oc,[0,a] 

* 0<3<n, 

is minimum. 

Hence, we have to find the polynomial of degree n, with fixed degree-i 
coefficient, whose norm is smallest. This is given by Theorem|2] Therefore, we 
have 

% 0<j<n, 

where /3j is the nonzero degree-i coefficient of T*(x/a). Therefore, we must 
have 

Ik-P||oo,[0,o] > t||y 

Now, if a polynomial is at a distance greater than e + Ae from p, it cannot be 
p* since 

||?-/||oo,[0,o] > Ik-P||oo,[0,a] - lb~ /Hoo,[0,a] > Ae. 

Therefore, if there exists i, < i < n, such that 

\S t \ > (e + Ae)|A| 

then | \q — p\\ > e + Ae and therefore q ^ p*. Hence, the degree-i coefficient of p* 
necessarily lies in the interval [pi — (e + Ae) |, pi + (e + Ae)|/?» |]. Thus we have 

\2^{ Pl - (e + Ae)|A|)l < ^p* < [2 m >( Pl + (e + Ae)|A'|)J, (4) 

S „ ' S v ' 

mi Mi 

since 2 mi p* is a rational integer: we have Mj — + 1 possible values for the 
integer 2 mi p*. This means that we have n™=o(^ ^ m i + 1) polynomials can- 
didates. If this amount is small enough, we search for p* by computing the 
norms ||/ — q||oo.[o.a]/ Q running among the possible polynomials. Otherwise, 
we need an additional step to decrease the number of candidates. Hence, we 
give now a method for this purpose. 
Condition means 

n 

/(z) ~ Ae < 5>^ ; < /(x) + Ae (5) 

i=0 
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for all x G [0, a]. In particular, we have 



/(0) - Ae < pi < /(0) + Ae 
i.e., since 2 m °pQ is an integer, 

\2 m °(f(0) - Ae)] < 2">$ < L2 mn (/(0) + Ae)J. 

The n + 1 inequations given by 0) define a polytope to which the numerators 
(i.e. the 2 mi p*) belong. The idea is to try to make this polytope smaller in order 
to reduce our final exhaustive search. We do that thanks to inequations O con- 
sidered for a certain number (chosen by the user) of values of £ G [0, a]. Once 
we got a small enough polytope, we start our exhaustive search using libraries 
(such as Polylib 1 11 1 and CLooG |2 [) specially designed for scanning efficiently 
the integer points of polytopes and producing only the corresponding loops 
in our program of exhaustive search. CLooG implements the Quillere et al. 
algorithm \12\. 

3 Examples 

We implemented in Maple a weakened version of the process described in the 
previous section. By this, we mean that in the step of refinement of the poly- 
tope, we only determine its vertices using the simplex method instead of scan- 
ning its integer points. The program first computes the bounds obtained from 
Chebyshev polynomials and then, if these bounds are too large, computes the 
vertices of the polytope obtained from inequations H) and inequations con- 
sidered for Xi = 2 A where d is an integer parameter chosen by the user, i an 
integer, < i < d and A is a rational number "close" and less than or equal to 
a. 

3.1 Cosine function in [0, 7r/4] with a degree-3 polynomial 

In [0, 7r/4], the distance between the cosine function and its best degree-3 min- 
imax approximation is 0.00011. This means that such an approximation is not 
good enough for single-precision implementation of the cosine function. It can 
be of interest for some special-purpose implementations. In this example, the 
bounds given by the first step (associated to Chebyshev polynomials) are good 
enough to avoid the use of the polytope refinement. 

>m := [12, 10, 6, 4] :polstar (cos, Pi/4, 3,m) ; 
"minimax = ", .9998864206 

+ (.00469021603 + (-.53030 8 8 6 65 + .06304636099 x) x) x 
"Distance between f and p =", .0001135879209 
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3 17 2 5 

"hatp = ", - x - — x + x + 1 

32 1024 



"Distance between f and hatp =", .0006939707 



>Do you want to continue (y; /n; ) ? y; 
>Enter the value of parameter lambda: 1/2; 

degree 0: 4 possible values between 2047/2048 and 
4097/4096 

degree 1: 22 possible values between -3/512 and 
15/1024 

degree 2: 5 possible values between -9/16 and 
-1/2 

degree 3: 1 possible values between 1/16 and 
1/16 

440 polynomials need be checked 

>Do you want to try to refine the bounds (y;/n;)?n; 



1 3 17 2 3 4095 

'pstar = " , — x - — x + x + 

16 32 512 4096 



"Distance between f and pstar =", .0002441406250 
"Time elapsed (in seconds) =", 1.840 

In this example, the distance between / and p* is approximately 0.35 times 
the distance between / and p. Using our method saves around - log 2 (0.35) « 
1.5 bits of accuracy. 



3.2 Exponential function in [0, log(l + 1/2048)] with a degree-3 
polynomial 

In [0, log(l + 1/2048)], the distance between the exponential function and its 
best degree-3 minimax approximation is around 1.8 x 10~ 17 , which should be 
sufficient for a faithfully rounded double precision implementation provided 
there is much care in the polynomial evaluation. The bounds given to get p* 
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using the first step are too large (there are 18523896 polynomials to test). Hence, 
we must use the polytope refinement. 

>Digits :=30 : 

>m := [56,45,33,23]: polstar (exp, log ( 1 . +1 . 12 4 8 ) , 3 , m) ; 

"minimax = ", .999999999999999981509827946165 + 
(1.00000000000121203815619648271 

+ (.4999999875860 630303204 93 910112 

+ .166707352549861488779274879363 x) x) x 

-16 

"Distance between f and p =", .1849017208895 10 

1398443 3 4294967189 2 35184372088875 

"hatp = ", x + x + x 

8388608 8589934592 35184372088832 



72057594037927935 



72057594037927936 



"Distance between f and hatp =", 

-16 

.23624220969326235229443 10 

>Do you want to continue (y;/n;)? y; 
>Enter the value of parameter lambda: 1; 

degree 0: 6 possible values between 
18014398509481983/18014398509481984 

and 72057594037 92 7937/72057594037 92 7936 
degree 1: 109 possible values between 
35184372088821/35184372088832 

and 35184372088 92 9/35184372088832 
degree 2: 146 possible values between 4294967117/8589934592 

and 2147483631/4294967296 
degree 3: 194 possible values between 699173/4194304 

and 1398539/8388608 
18523896 polynomials need be checked 

>Do you want to try to refine the bounds (y;/n;)?y; 
>Enter the value of parameter d: 25; 
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degree 0: 2 possible values between 
72057594037927935/72057594037927936 
and 1 

degree 1: 27 possible values between 35184372088857/35184372088832 

and 35184372088883/35184372088832 
degree 2: 32 possible values between 536870897/1073741824 

and 4294967207/8589934592 
degree 3: 44 possible values between 1398421/8388608 

and 21851/131072 
76032 polynomials need be checked 

>Do you want to try to refine the bounds (y;/n;)?n; 
>Do you want to change the value of Digits (y;/n;)?y; 
>Enter the value of Digits: 21; 

1398443 3 2147483595 2 35184372088873 

"pstar = ", x + x + x 

8388608 4294967296 35184372088832 

72057594037927935 

+ 

72057594037927936 

"Distance between f and pstar =", 

-16 

.20246280367096470182285 10 
"Time elapsed (in seconds) =", 54721.961 

In this last example, the distance between / and p* is approximately 0.85 
times the distance between / and p. Using our method saves around — log 2 (0.85) « 
0.22 bits of accuracy. 
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